Skip to content

fix(mothership): tool call loop#3729

Merged
Sg312 merged 63 commits intostagingfrom
improvement/tool-call-loop
Mar 24, 2026
Merged

fix(mothership): tool call loop#3729
Sg312 merged 63 commits intostagingfrom
improvement/tool-call-loop

Conversation

@Sg312
Copy link
Collaborator

@Sg312 Sg312 commented Mar 24, 2026

Summary

Rewrite mothership tool call loop

Type of Change

  • Bug fix

Testing

Manual

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

waleedlatif1 and others added 30 commits February 16, 2026 00:36
…stash, algolia tools; isolated-vm robustness improvements, tables backend (#3271)

* feat(tools): advanced fields for youtube, vercel; added cloudflare and dataverse tools (#3257)

* refactor(vercel): mark optional fields as advanced mode

Move optional/power-user fields behind the advanced toggle:
- List Deployments: project filter, target, state
- Create Deployment: project ID override, redeploy from, target
- List Projects: search
- Create/Update Project: framework, build/output/install commands
- Env Vars: variable type
- Webhooks: project IDs filter
- Checks: path, details URL
- Team Members: role filter
- All operations: team ID scope

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(youtube): mark optional params as advanced mode

Hide pagination, sort order, and filter fields behind the advanced
toggle for a cleaner default UX across all YouTube operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* added advanced fields for vercel and youtube, added cloudflare and dataverse block

* addded desc for dataverse

* add more tools

* ack comment

* more

* ops

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tables): added tables (#2867)

* updates

* required

* trashy table viewer

* updates

* updates

* filtering ui

* updates

* updates

* updates

* one input mode

* format

* fix lints

* improved errors

* updates

* updates

* chages

* doc strings

* breaking down file

* update comments with ai

* updates

* comments

* changes

* revert

* updates

* dedupe

* updates

* updates

* updates

* refactoring

* renames & refactors

* refactoring

* updates

* undo

* update db

* wand

* updates

* fix comments

* fixes

* simplify comments

* u[dates

* renames

* better comments

* validation

* updates

* updates

* updates

* fix sorting

* fix appearnce

* updating prompt to make it user sort

* rm

* updates

* rename

* comments

* clean comments

* simplicifcaiton

* updates

* updates

* refactor

* reduced type confusion

* undo

* rename

* undo changes

* undo

* simplify

* updates

* updates

* revert

* updates

* db updates

* type fix

* fix

* fix error handling

* updates

* docs

* docs

* updates

* rename

* dedupe

* revert

* uncook

* updates

* fix

* fix

* fix

* fix

* prepare merge

* readd migrations

* add back missed code

* migrate enrichment logic to general abstraction

* address bugbot concerns

* adhere to size limits for tables

* remove conflicting migration

* add back migrations

* fix tables auth

* fix permissive auth

* fix lint

* reran migrations

* migrate to use tanstack query for all server state

* update table-selector

* update names

* added tables to permission groups, updated subblock types

---------

Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: waleed <walif6@gmail.com>

* fix(snapshot): changed insert to upsert when concurrent identical child workflows are running (#3259)

* fix(snapshot): changed insert to upsert when concurrent identical child workflows are running

* fixed ci tests failing

* fix(workflows): disallow duplicate workflow names at the same folder level (#3260)

* feat(tools): added redis, upstash, algolia, and revenuecat (#3261)

* feat(tools): added redis, upstash, algolia, and revenuecat

* ack comment

* feat(models): add gemini-3.1-pro-preview and update gemini-3-pro thinking levels (#3263)

* fix(audit-log): lazily resolve actor name/email when missing (#3262)

* fix(blocks): move type coercions from tools.config.tool to tools.config.params (#3264)

* fix(blocks): move type coercions from tools.config.tool to tools.config.params

Number() coercions in tools.config.tool ran at serialization time before
variable resolution, destroying dynamic references like <block.result.count>
by converting them to NaN/null. Moved all coercions to tools.config.params
which runs at execution time after variables are resolved.

Fixed in 15 blocks: exa, arxiv, sentry, incidentio, wikipedia, ahrefs,
posthog, elasticsearch, dropbox, hunter, lemlist, spotify, youtube, grafana,
parallel. Also added mode: 'advanced' to optional exa fields.

Closes #3258

* fix(blocks): address PR review — move remaining param mutations from tool() to params()

- Moved field mappings from tool() to params() in grafana, posthog,
  lemlist, spotify, dropbox (same dynamic reference bug)
- Fixed parallel.ts excerpts/full_content boolean logic
- Fixed parallel.ts search_queries empty case (must set undefined)
- Fixed elasticsearch.ts timeout not included when already ends with 's'
- Restored dropbox.ts tool() switch for proper default fallback

* fix(blocks): restore field renames to tool() for serialization-time validation

Field renames (e.g. personalApiKey→apiKey) must be in tool() because
validateRequiredFieldsBeforeExecution calls selectToolId()→tool() then
checks renamed field names on params. Only type coercions (Number(),
boolean) stay in params() to avoid destroying dynamic variable references.

* improvement(resolver): resovled empty sentinel to not pass through unexecuted valid refs to text inputs (#3266)

* fix(blocks): add required constraint for serviceDeskId in JSM block (#3268)

* fix(blocks): add required constraint for serviceDeskId in JSM block

* fix(blocks): rename custom field values to request field values in JSM create request

* fix(trigger): add isolated-vm support to trigger.dev container builds (#3269)

Scheduled workflow executions running in trigger.dev containers were
failing to spawn isolated-vm workers because the native module wasn't
available in the container. This caused loop condition evaluation to
silently fail and exit after one iteration.

- Add isolated-vm to build.external and additionalPackages in trigger config
- Include isolated-vm-worker.cjs via additionalFiles for child process spawning
- Add fallback path resolution for worker file in trigger.dev environment

* fix(tables): hide tables from sidebar and block registry (#3270)

* fix(tables): hide tables from sidebar and block registry

* fix(trigger): add isolated-vm support to trigger.dev container builds (#3269)

Scheduled workflow executions running in trigger.dev containers were
failing to spawn isolated-vm workers because the native module wasn't
available in the container. This caused loop condition evaluation to
silently fail and exit after one iteration.

- Add isolated-vm to build.external and additionalPackages in trigger config
- Include isolated-vm-worker.cjs via additionalFiles for child process spawning
- Add fallback path resolution for worker file in trigger.dev environment

* lint

* fix(trigger): update node version to align with main app (#3272)

* fix(build): fix corrupted sticky disk cache on blacksmith (#3273)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Lakee Sivaraya <71339072+lakeesiv@users.noreply.github.com>
Co-authored-by: Vikhyath Mondreti <vikhyath@simstudio.ai>
Co-authored-by: Vikhyath Mondreti <vikhyathvikku@gmail.com>
… fixes, removed retired models, hex integration
…ogle tasks and bigquery integrations, workflow lock
@vercel
Copy link

vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Mar 24, 2026 1:06am

Request Review

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 24, 2026

Greptile Summary

This PR fixes a tool-call loop in the Mothership/Copilot orchestrator by replacing Redis polling with an in-process pub-sub event system and adding a claim-based delivery mechanism to prevent async tool results from being processed more than once.

Key changes:

  • Tool call loop fix: Replaces the old waitForToolDecision / waitForToolCompletion Redis polling loops with waitForToolConfirmation (pub-sub, DB as source of truth). Removes the now-unnecessary promptForToolApproval option and its entire code path from handlers.ts and the OrchestratorOptions type.
  • Claim-based async delivery: The orchestrator now atomically claims completed async tool rows (WHERE claimedBy IS NULL) before resuming the Go stream, marks them delivered on success, and releases the claim on failure — preventing the double-resume that caused the original loop.
  • New delivered enum value: Added to copilot_async_tool_status with a safe multi-step Postgres migration.
  • Abort signal propagation: cancelRunToolExecution creates and aborts an AbortController whose signal is passed into executeWorkflowWithFullLogging, giving proper cancellation of in-flight workflow runs.
  • buildToolCallSummaries fix: Pending/executing tools without a result no longer collapse to success, fixing incorrect status reporting for subagent tools.
  • PPTX write validation (workspace-file.ts): Before storing LLM-generated PptxGenJS code, the server now runs it to verify it produces a valid PPTX — but the validation uses an unsandboxed new Function() in the main process instead of the existing sandboxed subprocess in @/lib/execution/pptx-vm. This is a security regression that needs to be addressed before merge.
  • Several new focused unit tests cover the async continuation path, claim semantics, pub-sub wakeup, stream replay cancellation, and abort signal passthrough.

Confidence Score: 2/5

  • Not safe to merge as-is due to a server-side code injection vulnerability introduced in the PPTX validation path.
  • The core orchestration fix (claim-based async delivery, pub-sub confirmation, abort propagation) is well-designed and thoroughly tested. However, workspace-file.ts introduces an unsandboxed new Function() call to validate LLM-generated PPTX code in the main server process, bypassing the existing sandboxed subprocess executor in pptx-vm.ts. This is a direct server-side code execution vulnerability. The fix is a one-line import change, but it must happen before merge.
  • apps/sim/lib/copilot/tools/server/files/workspace-file.ts — the generatePptxFromCode function must be replaced with the import from @/lib/execution/pptx-vm.

Important Files Changed

Filename Overview
apps/sim/lib/copilot/tools/server/files/workspace-file.ts Adds server-side PPTX validation via a locally-defined new Function() executor, bypassing the sandboxed subprocess in pptx-vm.ts. Critical security regression.
apps/sim/lib/copilot/orchestrator/index.ts Core fix: replaces sequential async continuation with a claim-based loop that prevents double-delivery of async tool results. Adds markAsyncToolDelivered and releaseCompletedAsyncToolClaim for correct lifecycle management. The return removal from the done event handler is intentional and tested.
apps/sim/lib/copilot/orchestrator/persistence.ts Replaces Redis polling with an in-process pub-sub channel for tool confirmations. New waitForToolConfirmation uses a double-check pattern (before and after subscribe) to correctly handle the race window. Well-tested.
apps/sim/lib/copilot/orchestrator/sse/handlers/handlers.ts Removes promptForToolApproval path entirely and replaces fire-and-forget upsert with a properly awaited upsert (with error logging) before tool execution. Workflow run tools are now correctly forced to the client path.
apps/sim/lib/copilot/async-runs/repository.ts Adds getAsyncToolCall, getRunSegment, markAsyncToolDelivered, releaseCompletedAsyncToolClaim. claimCompletedAsyncToolCall uses an atomic WHERE claimedBy IS NULL predicate to prevent double-claiming. runId is NOT NULL in the DB schema so the null-pass concern in getRunSegment is safe.
apps/sim/app/api/copilot/confirm/route.ts Replaces Redis-based confirmation with durable DB upsert + pub-sub wakeup. Adds ownership check (run.userId !== authenticatedUserId) to prevent cross-user confirmation. existing.runId is always non-null per the DB NOT NULL constraint.
apps/sim/lib/copilot/async-runs/lifecycle.ts New module centralising async tool status constants and lifecycle predicates. Clean, well-tested helpers for isTerminalAsyncStatus, isDeliveredAsyncStatus, and inferDeliveredAsyncSuccess.
apps/sim/lib/copilot/client-sse/run-tool-execution.ts Adds AbortController per workflow execution and exposes cancelRunToolExecution. markRunToolManuallyStopped now returns the toolCallId so callers can pass an explicit override to reportManualRunToolStop, fixing a potential race between map deletion and the report call.
apps/sim/lib/copilot/orchestrator/stream/core.ts Fixes buildToolCallSummaries to no longer collapse pending/executing tools to success when no result exists. Tools without a result now keep their actual status, preventing false success reporting.
packages/db/migrations/0180_amused_marvel_boy.sql Adds delivered to the copilot_async_tool_status enum. The migration correctly converts the column to text, drops and recreates the enum, then restores the typed column — safe migration pattern for enum extension in Postgres.

Sequence Diagram

sequenceDiagram
    participant Client as Browser Client
    participant ConfirmAPI as /api/copilot/confirm
    participant DB as Postgres DB
    participant PubSub as In-Process PubSub
    participant Orchestrator as Orchestrator (index.ts)
    participant GoAPI as Go /api/tools/resume

    Client->>ConfirmAPI: POST {toolCallId, status}
    ConfirmAPI->>DB: getAsyncToolCall(toolCallId)
    DB-->>ConfirmAPI: existing row (runId NOT NULL)
    ConfirmAPI->>DB: getRunSegment(existing.runId)
    DB-->>ConfirmAPI: run row (userId check)
    ConfirmAPI->>DB: completeAsyncToolCall / upsertAsyncToolCall
    ConfirmAPI->>PubSub: publishToolConfirmation(event)
    ConfirmAPI-->>Client: 200 OK

    Note over Orchestrator: waitForToolCompletion() is parked
    PubSub-->>Orchestrator: event arrives
    Orchestrator->>DB: getToolConfirmation(toolCallId)
    DB-->>Orchestrator: terminal status → settle

    Orchestrator->>DB: claimCompletedAsyncToolCall(toolCallId, workerId)<br/>(WHERE claimedBy IS NULL)
    DB-->>Orchestrator: claimed row
    Orchestrator->>DB: getAsyncToolCalls([toolCallId])
    DB-->>Orchestrator: durable result

    Orchestrator->>GoAPI: POST /api/tools/resume {checkpointId, results}
    GoAPI-->>Orchestrator: stream response

    Orchestrator->>DB: markAsyncToolDelivered(toolCallId)
    Note over Orchestrator: claimedToolCallIds cleared
Loading

Reviews (1): Last reviewed commit: "Add delegating state to subagents" | Re-trigger Greptile

@Sg312 Sg312 merged commit 775daed into staging Mar 24, 2026
11 checks passed
@waleedlatif1 waleedlatif1 deleted the improvement/tool-call-loop branch March 24, 2026 02:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants